Sentence Extraction by Spreading Activation with Refined Similarity Measure
Abstract
Although automatic summarization has been studied extensively, most methods take a statistical approach and disregard the relationships between the extracted textual segments. To ensure sentence connectivity, we propose a novel method that extracts a set of comprehensible sentences centered on several key points. The method generates a similarity network from the documents using a lexical dictionary and applies spreading activation to rank sentences. We also report evaluation results for a multi-document summarization system based on this method, which participated in the TSC (Text Summarization Challenge) task of the third NTCIR (NII-NACSIS Test Collection for IR Systems) project.
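The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain bag-of-words cosine similarity stands in for the dictionary-based similarity measure, and the function names, the `threshold` and `decay` parameters, and the uniform initial activation are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_network(sentences, threshold=0.1):
    """Build a symmetric weight matrix; weak links below threshold are dropped.

    Assumption: word overlap replaces the lexical-dictionary similarity
    described in the abstract.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                weights[i][j] = weights[j][i] = sim
    return weights


def spread_activation(weights, decay=0.5, iterations=20):
    """Iterate a[i] = initial[i] + decay * sum_j (w[j][i] / total_j) * a[j].

    Each node splits its activation among its neighbours in proportion to
    link weight; decay < 1 keeps the iteration bounded.
    """
    n = len(weights)
    totals = [sum(row) for row in weights]
    initial = [1.0] * n  # assumed: every sentence starts equally active
    activation = initial[:]
    for _ in range(iterations):
        activation = [
            initial[i]
            + decay
            * sum(
                weights[j][i] / totals[j] * activation[j]
                for j in range(n)
                if totals[j] > 0
            )
            for i in range(n)
        ]
    return activation


def rank_sentences(sentences, top_k=2):
    """Return indices of the top_k sentences by final activation."""
    activation = spread_activation(similarity_network(sentences))
    order = sorted(range(len(sentences)), key=lambda i: activation[i], reverse=True)
    return order[:top_k]
```

Calling `rank_sentences(sentences, top_k=3)` returns sentence indices ordered by activation. A sentence with no links to the others receives no spread activation and keeps only its initial value, so well-connected sentences rise to the top.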
Similar resources
iSpreadRank: Ranking sentences for extraction-based summarization using feature weight propagation in the sentence similarity network
Sentence extraction is a widely adopted text summarization technique where the most important sentences are extracted from document(s) and presented as a summary. The first step towards sentence extraction is to rank sentences by their importance to the summary. This paper proposes a novel graph-based ranking method, iSpreadRank, to perform this task. iSpreadRank models a set of topic-rel...
Comparison of Different Machine Learning Methods for Persian Speech-to-Speech Extractive Summarization without Transcripts
In this paper, extractive speech summarization using different machine learning algorithms is investigated. Speech summarization extracts important and salient segments from speech so that speech files can be accessed, searched, and browsed more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...
Knotenähnlichkeiten aus Aktivierungsausbreitungen (Node Similarities from Spreading Activation)
This thesis discusses how networks can be searched via methods of spreading activation. Networks, often represented as graphs, are datasets consisting of data objects (nodes or vertices) and the relations among them (edges). In many areas, e.g. information retrieval, networks are searched according to specific queries for related, interesting and relevant data objects, by applying methods such ...
Document Summarization Retrieval System Based on Web User Needs
Existing models for document summarization mostly use the similarity between sentences in the document to extract the most salient sentences. The documents as well as the sentences are indexed using traditional term indexing measures, which do not take the context into consideration. Therefore, the sentence similarity values remain independent of the context. In this paper, we propose a context...
A Geometric View of Similarity Measures in Data Mining
The main objective of data mining is to acquire information from a set of data for prospective applications using a measure. The issue of concern is that one often has to deal with large-scale data. Several dimensionality reduction techniques, such as various feature extraction methods, have been developed to resolve the issue. However, the geometric view of the applied measure, as an additional consid...